rootfs commented Sep 9, 2025

What type of PR is this?

This is a work-in-progress (WIP) PR that adds persistent storage backends for the semantic cache.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #94 #95

Release Notes: Yes/No

netlify bot commented Sep 9, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit c8e9266
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68c20e7676ad2b00087dd040
😎 Deploy Preview https://deploy-preview-105--vllm-semantic-router.netlify.app

@rootfs force-pushed the semcaching branch 2 times, most recently from 3eec72b to 9efcc06, on September 9, 2025 16:41

github-actions bot commented Sep 9, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 config

Owners: @rootfs
Files changed:

  • config/cache/milvus.yaml
  • config/config.yaml

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/cache/cache_factory.go
  • src/semantic-router/pkg/cache/cache_interface.go
  • src/semantic-router/pkg/cache/inmemory_cache.go
  • src/semantic-router/pkg/cache/milvus_cache.go
  • src/semantic-router/go.mod
  • src/semantic-router/go.sum
  • src/semantic-router/pkg/cache/cache.go
  • src/semantic-router/pkg/cache/cache_test.go
  • src/semantic-router/pkg/config/config.go
  • src/semantic-router/pkg/config/config_test.go
  • src/semantic-router/pkg/extproc/caching_test.go
  • src/semantic-router/pkg/extproc/router.go
  • src/semantic-router/pkg/extproc/test_utils_test.go
  • src/semantic-router/pkg/metrics/metrics.go

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • Makefile

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

rootfs commented Sep 9, 2025

CI failed because no Milvus instance is running in the CI environment. For now, these tests are skipped on CI.
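
One way to skip Milvus-dependent tests on CI is an environment-variable guard. The sketch below is an illustration only, not the mechanism this PR uses: the variable name `SKIP_MILVUS_TESTS` and the helpers `milvusTestsDisabled` and `requireMilvus` are hypothetical. Only `t.Helper` and `t.Skipf` come from Go's standard `testing` package.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

// skipMilvusEnv is a hypothetical variable name chosen for this sketch.
const skipMilvusEnv = "SKIP_MILVUS_TESTS"

// milvusTestsDisabled reports whether the given lookup function says that
// Milvus-backed tests should be skipped. Taking the lookup as a parameter
// keeps the decision testable without touching the real environment.
func milvusTestsDisabled(lookup func(string) string) bool {
	v := lookup(skipMilvusEnv)
	return v == "1" || v == "true"
}

// requireMilvus skips the calling test when Milvus-backed tests are disabled.
// Call it as the first statement of any test that needs a live Milvus.
func requireMilvus(t *testing.T) {
	t.Helper()
	if milvusTestsDisabled(os.Getenv) {
		t.Skipf("skipping: %s is set, no Milvus instance expected", skipMilvusEnv)
	}
}

func main() {
	// Shows the decision the guard would make in the current environment.
	fmt.Println("Milvus tests disabled:", milvusTestsDisabled(os.Getenv))
}
```

With a guard like this, the CI job would export the variable, while a developer with a local Milvus instance leaves it unset and runs the full suite.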

rootfs commented Sep 9, 2025

@Xunzhuo No doc changes in this PR. I'll add more docs on how to set up Milvus and in-memory caching in a follow-up PR.

@Xunzhuo changed the title from "feat: Semantic Cache Refactoring and Milvus Persistent Storage Support" to "feat: add milvus persistent storage support" on Sep 10, 2025
- Create CacheBackend interface with pluggable architecture
- Refactor existing in-memory cache to implement new interface
- Add cache factory pattern for backend selection
- Support configurable similarity thresholds and TTL
- Add comprehensive cache metrics and observability

Addresses vllm-project#94

Signed-off-by: Huamin Chen <[email protected]>
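
The commit above describes a pluggable backend interface, an in-memory implementation, a factory for backend selection, and a configurable similarity threshold. A minimal sketch of how those pieces could fit together follows. Every name and signature here (`CacheBackend`, `CacheConfig`, `NewCacheBackend`, the method set) is an assumption made for illustration; the real definitions live in `cache_interface.go` and `cache_factory.go` and may differ.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sync"
)

// CacheBackend is a hypothetical interface; the PR's real method set may differ.
type CacheBackend interface {
	Add(query string, embedding []float32, response string) error
	FindSimilar(embedding []float32) (response string, found bool, err error)
	Close() error
}

// CacheConfig holds the illustrative settings used by the factory below.
type CacheConfig struct {
	BackendType         string  // "memory" (default) or "milvus"
	SimilarityThreshold float32 // minimum cosine similarity for a hit
}

type entry struct {
	query     string
	embedding []float32
	response  string
}

// inMemoryCache does a linear scan over stored embeddings.
type inMemoryCache struct {
	mu        sync.RWMutex
	entries   []entry
	threshold float32
}

func (c *inMemoryCache) Add(query string, embedding []float32, response string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries = append(c.entries, entry{query: query, embedding: embedding, response: response})
	return nil
}

// FindSimilar returns the best-scoring response at or above the threshold.
func (c *inMemoryCache) FindSimilar(embedding []float32) (string, bool, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	best, bestScore, found := "", float32(0), false
	for _, e := range c.entries {
		score := cosine(embedding, e.embedding)
		if score >= c.threshold && (!found || score > bestScore) {
			best, bestScore, found = e.response, score, true
		}
	}
	return best, found, nil
}

func (c *inMemoryCache) Close() error { return nil }

// cosine returns the cosine similarity of a and b, or 0 for mismatched,
// empty, or zero-length vectors.
func cosine(a, b []float32) float32 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return float32(dot / (math.Sqrt(na) * math.Sqrt(nb)))
}

// NewCacheBackend selects a backend by name. The Milvus case is left
// unimplemented because this sketch avoids guessing at client APIs.
func NewCacheBackend(cfg CacheConfig) (CacheBackend, error) {
	switch cfg.BackendType {
	case "", "memory":
		return &inMemoryCache{threshold: cfg.SimilarityThreshold}, nil
	case "milvus":
		return nil, errors.New("milvus backend is not part of this sketch")
	default:
		return nil, fmt.Errorf("unknown cache backend %q", cfg.BackendType)
	}
}

func main() {
	cache, err := NewCacheBackend(CacheConfig{BackendType: "memory", SimilarityThreshold: 0.8})
	if err != nil {
		panic(err)
	}
	defer cache.Close()
	_ = cache.Add("what is vLLM?", []float32{1, 0, 0}, "cached answer")
	resp, found, _ := cache.FindSimilar([]float32{0.9, 0.1, 0})
	fmt.Println(resp, found)
}
```

Callers depend only on the interface, so adding another backend in this sketch means adding one `case` to the factory.
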
- Implement MilvusCache backend with persistent storage
- Add Milvus configuration file and connection management
- Support vector similarity search with configurable indexing
- Add TTL support and collection lifecycle management
- Include Milvus dependencies and build configuration

Addresses vllm-project#95

Signed-off-by: Huamin Chen <[email protected]>

rootfs commented Sep 11, 2025

Merging this PR now. cc @Xunzhuo @aeft

@rootfs merged commit 3ce8a6e into vllm-project:main on Sep 11, 2025
9 checks passed

Development

Successfully merging this pull request may close these issues.

Semantic Cache Refactoring: Support More VectorDB Backends
